12 research outputs found

    Stochastic Optimization Algorithms for Problems with Controllable Biased Oracles

    Motivated by multiple emerging applications in machine learning, we consider an optimization problem in a general form where the gradient of the objective is available through a biased stochastic oracle. We assume the bias magnitude can be reduced by a bias-control parameter; however, a lower bias requires more computation or samples. For two applications, stochastic composition optimization and policy optimization for infinite-horizon Markov decision processes, we show that the bias follows a power law and an exponential decay, respectively, as functions of the corresponding bias-control parameters. For problems with such gradient oracles, the paper proposes stochastic algorithms that adjust the bias-control parameter throughout the iterations. We analyze the nonasymptotic performance of the proposed algorithms in the nonconvex regime and establish their sample or bias-control computation complexities to obtain a stationary point. Finally, we numerically evaluate the performance of the proposed algorithms on the two applications.
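
    The trade-off described in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the toy objective f(x) = 0.5 * x**2, the 1/k power-law bias, the Gaussian noise, the cost model, and the names `biased_oracle` and `sgd_with_bias_schedule` are all assumptions chosen for illustration.

```python
import random


def biased_oracle(x, k, rng, noise=0.1):
    """Toy stochastic gradient of f(x) = 0.5 * x**2.

    The true gradient is x. The added 1/k term is an assumed power-law
    bias that shrinks as the bias-control parameter k grows.
    """
    return x + 1.0 / k + rng.gauss(0.0, noise)


def sgd_with_bias_schedule(x0, steps, schedule, step_size=0.1, seed=0):
    """Run SGD where schedule(t) picks the bias-control parameter at step t."""
    rng = random.Random(seed)
    x = x0
    total_cost = 0
    for t in range(1, steps + 1):
        k = schedule(t)
        x -= step_size * biased_oracle(x, k, rng)
        total_cost += k  # assumption: one oracle call costs k units of work
    return x, total_cost


# Growing schedule: bias vanishes over time, at a growing per-step cost.
x_growing, cost_growing = sgd_with_bias_schedule(5.0, 500, lambda t: t)

# Fixed schedule: cheap, but the iterate settles near the biased point x = -1.
x_fixed, cost_fixed = sgd_with_bias_schedule(5.0, 500, lambda t: 1)
```

    In this toy setting the growing schedule ends near the true stationary point x = 0, while the fixed schedule stays near x = -1 and spends far less total oracle work.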

    Riemannian Stochastic Gradient Method for Nested Composition Optimization

    This work considers optimization of a composition of functions in a nested form over Riemannian manifolds, where each function contains an expectation. This type of problem is gaining popularity in applications such as policy evaluation in reinforcement learning or model customization in meta-learning. The standard Riemannian stochastic gradient methods for non-compositional optimization cannot be directly applied, as stochastic approximation of the inner functions creates bias in the gradients of the outer functions. For two-level composition optimization, we present a Riemannian Stochastic Composition Gradient Descent (R-SCGD) method that finds an approximate stationary point, with expected squared Riemannian gradient smaller than $\epsilon$, in $O(\epsilon^{-2})$ calls to the stochastic gradient oracle of the outer function and the stochastic function and gradient oracles of the inner function. Furthermore, we generalize the R-SCGD algorithm to problems with multi-level nested compositional structures, with the same complexity of $O(\epsilon^{-2})$ for the first-order stochastic oracle. Finally, the performance of the R-SCGD method is numerically evaluated on a policy evaluation problem in reinforcement learning.

    Stochastic Composition Optimization of Functions without Lipschitz Continuous Gradient

    In this paper, we study the stochastic optimization of a two-level composition of functions without Lipschitz continuous gradient. The smoothness property is generalized by the notion of relative smoothness, which motivates the Bregman gradient method. We propose three Stochastic Compositional Bregman Gradient algorithms for the three possible nonsmooth compositional scenarios and provide their sample complexities to achieve an $\epsilon$-approximate stationary point. For the smooth-of-relative-smooth composition, the first algorithm requires $O(\epsilon^{-2})$ calls to the stochastic oracles of the inner function value and gradient as well as the outer function gradient. When both functions are relatively smooth, the second algorithm requires $O(\epsilon^{-3})$ calls to the inner function stochastic oracle and $O(\epsilon^{-2})$ calls to the inner and outer function stochastic gradient oracles. We further improve the second algorithm by variance reduction for the setting where just the inner function is smooth. The resulting algorithm requires $O(\epsilon^{-5/2})$ calls to the stochastic inner function value, $O(\epsilon^{-3/2})$ calls to the inner stochastic gradient, and $O(\epsilon^{-2})$ calls to the outer function stochastic gradient. Finally, we numerically evaluate the performance of these algorithms on two examples.

    Geodesic Gaussian Processes for the Parametric Reconstruction of a Free-Form Surface

    Reconstructing a free-form surface from 3-dimensional (3D) noisy measurements is a central problem in inspection, statistical quality control, and reverse engineering. We present a new method for the statistical reconstruction of a free-form surface patch based on 3D point cloud data. The surface is represented parametrically, with each of the three Cartesian coordinates (x, y, z) a function of surface coordinates (u, v), a model form compatible with computer-aided-design (CAD) models. This model form also avoids having to choose one Euclidean coordinate (say, z) as a “response” function of the other two coordinate “locations” (say, x and y), as commonly used in previous Euclidean kriging models of manufacturing data. The (u, v) surface coordinates are computed using parameterization algorithms from the manifold learning and computer graphics literature. These are then used as locations in a spatial Gaussian process model that considers correlations between two points on the surface a function of their geodesic distance on the surface, rather than a function of their Euclidean distances over the xy plane. We show how the proposed geodesic Gaussian process (GGP) approach better reconstructs the true surface, filtering the measurement noise, than when using a standard Euclidean kriging model of the “heights”, that is, z(x, y). The methodology is applied to simulated surface data and to a real dataset obtained with a noncontact laser scanner. Supplementary materials are available online.
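
    The contrast between geodesic and xy-plane distances can be sketched on a toy surface. This is not the paper's method: the unit-radius half-cylinder, the squared-exponential covariance, the length scale, and all function names are assumptions for illustration. On this particular patch the geodesic distance equals the Euclidean distance in (u, v), which is not true for general free-form surfaces.

```python
import math


def surface_point(u, v):
    """Map surface coordinates (u, v) to (x, y, z) on a unit half-cylinder.

    u is the angle around the axis (0 < u < pi), v runs along the axis.
    """
    return (math.cos(u), v, math.sin(u))


def geodesic_distance(p, q):
    """Distance along the surface between two (u, v) points.

    A cylinder patch unrolls flat without stretching, so for radius 1
    the geodesic distance is the plain Euclidean distance in (u, v).
    """
    return math.hypot(p[0] - q[0], p[1] - q[1])


def xy_distance(p, q):
    """Distance between the xy-plane projections, as a height model z(x, y) would use."""
    x1, y1, _ = surface_point(*p)
    x2, y2, _ = surface_point(*q)
    return math.hypot(x1 - x2, y1 - y2)


def squared_exponential(d, length_scale=0.5, variance=1.0):
    """Assumed covariance function of the distance d."""
    return variance * math.exp(-0.5 * (d / length_scale) ** 2)


# Two points on the steep edge of the half-cylinder.
a, b = (0.0, 0.0), (0.5, 0.0)
d_geo = geodesic_distance(a, b)
d_xy = xy_distance(a, b)
cov_geo = squared_exponential(d_geo)
cov_xy = squared_exponential(d_xy)
```

    On the steep part of this surface the xy projection compresses the separation, so a covariance based on xy distance treats the two points as more strongly correlated than a covariance based on distance along the surface.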

    Riemannian Stochastic Variance-Reduced Cubic Regularized Newton Method

    We propose a stochastic variance-reduced cubic regularized Newton algorithm to optimize the finite-sum problem over a Riemannian manifold. The proposed algorithm requires a full gradient and Hessian update at the beginning of each epoch, while it performs stochastic variance-reduced updates in the iterations within each epoch. The iteration complexity of the algorithm to obtain an $(\epsilon,\sqrt{\epsilon})$-second-order stationary point, i.e., a point with the Riemannian gradient norm upper bounded by $\epsilon$ and the minimum eigenvalue of the Riemannian Hessian lower bounded by $-\sqrt{\epsilon}$, is shown to be $O(\epsilon^{-3/2})$. Furthermore, the paper proposes a computationally more appealing modification of the algorithm which only requires an inexact solution of the cubic regularized Newton subproblem, with the same iteration complexity. The proposed algorithm is evaluated in two numerical studies: estimating the inverse scale matrix of the multivariate t-distribution over the manifold of symmetric positive definite matrices, and estimating the parameter of a linear classifier over the sphere manifold. The proposed algorithm is also compared with three other Riemannian second-order methods.
